fix(watch): purge embeddings before nodes to stop FK crash in rebuildFile#1182
Conversation
…File The watcher's `rebuildFile` had its own `purgeAncillaryData` helper that missed the `embeddings` table. With `codegraph embed` populated and better-sqlite3's default `foreign_keys = ON`, the next `deleteNodes` fired `SQLITE_CONSTRAINT_FOREIGNKEY` (embeddings.node_id → nodes.id) and killed the watcher process. Switch to the shared `purgeFileData` helper used by the main builder — it purges embeddings, cfg_*, dataflow, complexity, node_metrics, ast_nodes, edges, and nodes in the correct order — and drop the now unused stmt prep. Wrap per-file `rebuildFile` calls in try/catch so any future SQLite or parse error logs+skips instead of crashing the long-running watcher. Closes #1176
|
Claude finished @carlos-alm's task —— View job Reviewing PR #1182: Foreign key constraint fix in watcher Todo List
Phase 1: Root Cause Analysis ✓Problem Being Solved: The watcher's Why This Approach: The fix correctly switches from the watcher's custom purge logic to the shared Risk Assessment: Low risk. This is a defensive fix that prevents crashes and uses existing, well-tested purge logic rather than duplicating/maintaining separate deletion code. Phase 2: Configuration Change Detection ✓No configuration changes detected. This PR makes no changes to:
Phase 3: Test Integrity Verification ✓New Regression Test:
Existing Tests: Test Quality: High-quality regression test that directly reproduces the user-reported failure scenario. Phase 4: Code Quality Review ✅Excellent Cleanup:
Code Quality: Very clean implementation with excellent comments explaining the FK ordering requirement and referencing issue #1176. No Security/Performance Concerns: Uses existing, vetted database operations. No command injection, SQL injection, or performance regressions. Phase 5: Backlog Compliance Assessment ✅Zero-dep: ✓ — No new runtime dependencies added. Uses existing SQLite operations. Foundation-aligned: ✓ — Fully compliant with FOUNDATION.md principles:
Problem-fit: 4/5 — Directly prevents watcher crashes that would disrupt AI agents during long-running sessions. While not directly addressing "lost AI" token waste, it ensures the core dependency tracking infrastructure remains stable and available. Breaking: No — This is a pure bug fix with no API changes. The Tier: 1 (Zero-dep ✓ + Foundation-aligned ✓) Root Cause AnalysisProblem Being Solved: Foreign key constraint violation crashes in watcher when embeddings exist Backlog Compliance
Critical ConcernsNone identified. This is a well-executed fix with excellent test coverage. Final Recommendation
|
Greptile SummaryThis PR fixes a foreign-key crash in the watcher's
Confidence Score: 5/5Safe to merge — the fix is a well-scoped replacement of a local purge helper with the already-battle-tested shared utility, and the new regression test directly reproduces the crash scenario. All changed code follows existing patterns, the FK ordering in No files require special attention. Important Files Changed
Sequence DiagramsequenceDiagram
participant FS as File System (watcher)
participant PP as processPendingFiles
participant RF as rebuildFile
participant PFD as purgeFileData
participant DB as SQLite DB
FS->>PP: file change detected
loop for each changed file
PP->>RF: rebuildFile(db, rootDir, filePath, stmts, ...)
activate RF
RF->>DB: findReverseDeps (read edges before purge)
RF->>PFD: "purgeFileData(db, relPath, { purgeHashes: false })"
activate PFD
PFD->>DB: DELETE FROM embeddings WHERE node_id IN (nodes for file)
PFD->>DB: DELETE FROM cfg_edges / cfg_blocks / dataflow / complexity / node_metrics / ast_nodes
PFD->>DB: DELETE FROM edges (source/target in file)
PFD->>DB: "DELETE FROM nodes WHERE file = ?"
deactivate PFD
RF->>RF: "parse file, insert new nodes & edges"
RF-->>PP: RebuildResult
deactivate RF
Note over PP: catch(err) → warn + debug, skip file
end
PP->>DB: writeJournalAndChangeEvents (batch)
Reviews (5): Last reviewed commit: "Merge branch 'main' into fix/1176-watch-..." | Re-trigger Greptile |
| warn(`Failed to rebuild ${relPath}: ${(err as Error).message} — skipping`); | ||
| debug((err as Error).stack ?? String(err)); |
There was a problem hiding this comment.
If
err is a non-Error value (e.g., a plain string or null thrown by a third-party dependency), (err as Error).message evaluates to undefined, so the logged warning reads "Failed to rebuild …: undefined — skipping", which is hard to diagnose. Narrowing with instanceof produces a safe message in all cases.
| warn(`Failed to rebuild ${relPath}: ${(err as Error).message} — skipping`); | |
| debug((err as Error).stack ?? String(err)); | |
| const message = err instanceof Error ? err.message : String(err); | |
| warn(`Failed to rebuild ${relPath}: ${message} — skipping`); | |
| debug(err instanceof Error ? (err.stack ?? message) : String(err)); |
There was a problem hiding this comment.
Fixed in d2db316 — replaced (err as Error).message with instanceof Error narrowing so a non-Error throw logs a usable message instead of "undefined".
| const remaining = db | ||
| .prepare( | ||
| `SELECT COUNT(*) AS c FROM embeddings e | ||
| JOIN nodes n ON e.node_id = n.id | ||
| WHERE n.file = ?`, | ||
| ) | ||
| .get('shared/constants.js') as { c: number }; | ||
| expect(remaining.c).toBe(0); |
There was a problem hiding this comment.
Secondary assertion does not verify the seeded row was deleted
The query JOIN nodes n ON e.node_id = n.id WHERE n.file = ? counts embeddings that reference currently existing nodes for shared/constants.js. After rebuildFile the old node IDs are gone, so even if the embedding row were left as an orphan the JOIN would find nothing and the count would still be 0. A more reliable check is to count all rows in embeddings (there was exactly one seeded row) or use WHERE node_id = ? with the known seeded ID, so the assertion actually fails if the row survives as an orphan.
There was a problem hiding this comment.
Fixed in d2db316 — switched the assertion to SELECT COUNT(*) FROM embeddings (exactly one row was seeded), so it now catches the orphan-row regression the JOIN check missed.
…1182) The watcher's local getNodeId.get used a typed parameter list (name, kind, file, line) but IncrementalStmts declares get: (...params: unknown[]). TS2322 rejected the narrower form, which broke the build job and all downstream CI checks. Switch to (...params: unknown[]) and destructure inside the body so the implementation matches the interface contract.
Addresses Greptile P2 review feedback: - src/domain/graph/watcher.ts: replace `(err as Error).message` with `instanceof Error` narrowing so a non-Error throw (plain string, null, or any value from a third-party dependency) logs a usable message instead of "undefined". - tests/integration/watcher-fk-embeddings.test.ts: count all rows in `embeddings` directly instead of joining on `nodes.file`. The JOIN-based check would pass even if the seeded row survived as an orphan with a dangling node_id; counting the table itself catches that regression.
Codegraph Impact Analysis3 functions changed → 7 callers affected across 2 files
|
Summary
rebuildFilerolled its ownpurgeAncillaryDatahelper that purgedcfg_*,dataflow,complexity,node_metrics, andast_nodes— but notembeddings. Withcodegraph embedpopulated and better-sqlite3's defaultPRAGMA foreign_keys = ON, the subsequentDELETE FROM nodesfiredSQLITE_CONSTRAINT_FOREIGNKEY(embeddings.node_id→nodes.id) and killed the long-running watcher process.rebuildFileto the sharedpurgeFileDatahelper already used by the main builder — it deletes from all child tables (includingembeddings) in the correct FK-safe order, thenedges, thennodes. Drop the now-unuseddeleteEdgesForFile/deleteNodes/countEdgesForFileplumbing fromIncrementalStmtsand the watcher's statement prep.rebuildFilecalls inprocessPendingFileswith try/catch so any future SQLite/parse/filesystem error logs and skips instead of crashing the watcher. The watcher is a long-running session — one bad file should not bring it down.Test plan
tests/integration/watcher-fk-embeddings.test.tsreproduces the failure (seeds anembeddingsrow, enablesPRAGMA foreign_keys = ON, callsrebuildFile) and asserts no FK violation + that the embedding row is purged alongside the rebuilt file's nodes.tests/integration/watcher-rebuild.test.tsstill passes with the slimmerIncrementalStmtsshape.npm test— 2732 passed, 24 skipped, 0 failed.npm run lint— clean.Closes #1176